Worksheet 1 - Communications scatter plot


This worksheet will walk you through everything you need to do to turn raw data into a finished plot. Along the way you’ll be:

  • making an RStudio project
  • adding your data to the project
  • making a script
  • installing R packages
  • writing code in your script, and
  • running that code

By the end of the worksheet you’ll have created the plot below from some raw data. Not the most exciting plot in the world, but making it showcases several key bits of code that you can reuse to make more fun stuff later!

The final plot

1 Make an RStudio project


1.1 Log in to Posit Cloud

Go to https://posit.cloud/ and log in. You’ll have to create an account if you don’t already have one.

When you’re logged in you should see a screen like this:

1.2 Make a new project

Click the New Project button and then New RStudio Project.

1.3 Name your project

While you’re waiting for the project to deploy, rename it by clicking Untitled Project and typing a new name.

1.4 Check it’s worked

If everything’s worked, you should see a screen like this:

2 Tweak RStudio settings


Every time you make a new RStudio project in Posit Cloud you’ll need to change some of the default settings! It’s annoying but will make life easier in the long run.

Click Tools > Global Options from the RStudio menu bar.

In the General > Basic tab:

Uncheck:

  • Restore .RData into workspace on startup

  • Always save history (even when not saving .RData)

Set Save workspace to .RData on exit to “Never”.

This tab should now look like this:

3 Add your data to the project


3.1 Download the dataset

This worksheet’s dataset is modified from one found at the World Bank.

Click the link below to download the dataset for this worksheet to somewhere on your hard drive (somewhere like your Desktop or Downloads folder will be fine).

3.2 Add the data to your project

Click on the Upload button in the RStudio Files pane.

In the popup window, click on the grey Browse button and select your data file from wherever you saved it in the previous step. Then click OK to upload it to your project.

Now you should see the data file in your RStudio Files pane.

4 Make a new script


4.1 Make a new file

Click on the New file button in RStudio and select R Script.

4.2 Save your file

Click the Save button. Don’t forget to do this as often as you can while you’re writing code in your script! You can also use Ctrl+S (Cmd+S on macOS).

In the popup window, type a name for your script in the File name field and click Save. I almost always just call mine “script” unless I’m feeling extra imaginative.

Now you should see the script file in your RStudio Files pane.

Close your script by clicking the cross on its tab.

Open it again by left-clicking the script file in the RStudio Files pane.

Your script is where you’re going to write and save all of your code. At the end of this worksheet it will look something like this:

5 Using RStudio


There’s a lot going on in RStudio, and you may feel a bit overwhelmed at first. Don’t worry though, you’ll only ever need to use a few key bits.

Now you’ve loaded in your data and made a script, let’s look at the most useful parts of RStudio.

5.1 Your script

Up in the top-left pane is your script. If you have more than one script, you can open them all here, just like tabs in your internet browser. Other stuff opens here in tabs sometimes - don’t worry about that for now though!

5.2 The console

Down in the bottom-left pane is the console. This is where all R code gets run, and where you’ll see things like error messages, warnings, or information.

When you run code from your script, it will get run in the console. You can also type code directly into the console and run it.

5.3 The Environment Pane

In the top-right is the Environment pane. Here is where things called “objects” appear. You’ll learn about these in a bit.

5.4 The Files Pane

In the bottom-right you’ll see several tabs, one of which is the Files pane.

Here you can see all the files associated with your current project (e.g. Excel files, scripts). You can upload any data files you need for your project here, as well as rename files and create folders - just like you can with a folder on your Desktop.

5.5 Other panes

In the bottom-right you’ll also use the Plots, Help, and Viewer tabs. You’ll get to know what these do later on.

6 Coding step-by-step


6.1 The goal

By the end of this section your script file should look like this:

# READ IN DATA FROM EXCEL FILE --------------------------------------------

data_raw <-
  readxl::read_xlsx(path = "communications.xlsx")



# FILTER OUR DATASET ------------------------------------------------------

data_filtered <-
  data_raw |>
  dplyr::filter(country_name %in% c("Italy", "Norway")) |>
  dplyr::filter(indicator_name != "Broadband subscriptions")



# MAKE PLOT ---------------------------------------------------------------

my_plot <-
  data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3,
                      alpha = 0.75) +
  ggplot2::scale_x_continuous(breaks = scales::breaks_width(width = 10)) +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::facet_wrap(facets = ggplot2::vars(indicator_name),
                      ncol = 1) +
  ggplot2::labs(x = "Year",
                y = "Subscriptions per person") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")



# SAVE PLOT ---------------------------------------------------------------

ggplot2::ggsave(filename = "plot_mobile_landline.png",
                plot = my_plot,
                width = 110,
                height = 200,
                units = "mm",
                dpi = 300) # this argument changes the plot resolution

And running it should create a plot like the one you saw earlier!

We’re now going to build this script bit by bit so we can see how it all fits together.

6.2 Install a package

R doesn’t have a built-in way to read Excel files. We need to install a package called readxl to do it.

Packages

Packages are like plugins or add-ons that let us do useful things. We need to install them before we can use them.

To install the readxl package, type install.packages("readxl") at the prompt (the cursor just after the >) in the RStudio console and press Enter.

Apart from installing packages and other one-off commands, we won’t be typing much in the console at all. Instead we’ll type all our code in the script. That way we can save it so that we i) know exactly what we did to the data, and ii) can rerun it if our dataset gets updated.
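If you’re not sure whether a package is already installed, you can ask R before installing it again. Here’s a minimal sketch using the requireNamespace function that comes with R. The helper name needs_install is made up for this example and isn’t part of any package.

```r
# Hypothetical helper: returns TRUE if a package still needs installing
needs_install <- function(package_name) {
  !requireNamespace(package_name, quietly = TRUE)
}

# The "stats" package ships with R, so this is FALSE
needs_install("stats")

# Only nag about readxl if it's missing
if (needs_install("readxl")) {
  message("readxl is missing - run install.packages(\"readxl\") in the console")
}
```

This is the kind of one-off check that’s fine to run in the console rather than save in your script.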

6.3 Read in data from Excel file

Now we can use a function from the readxl package called read_xlsx to read in our dataset.

Functions

Functions are like verbs - they do things. For example, read_xlsx reads in Excel files, paste sticks together different bits of text, and mutate changes a column in a dataset.

Functions almost always need brackets after them, like this: read_xlsx().
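To see a couple of functions in action, here’s a small sketch using paste and toupper, which both come with R (no package needed). The text being stuck together is just an example.

```r
# paste sticks bits of text together (with a space in between by default)
greeting <- paste("Hello", "world")
greeting
# [1] "Hello world"

# toupper turns text into capital letters
toupper(greeting)
# [1] "HELLO WORLD"
```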

In your script (not in the console!) write:

readxl::read_xlsx(path = "communications.xlsx") 

Now highlight that text and do Ctrl+Enter (or Cmd+Enter on macOS). In the console you should see something like this:

# A tibble: 25,132 × 5
   country_name country_code indicator_name           year      value
   <chr>        <chr>        <chr>                   <dbl>      <dbl>
 1 Afghanistan  AFG          Broadband subscriptions  2004 0.00000849
 2 Afghanistan  AFG          Broadband subscriptions  2005 0.00000901
 3 Afghanistan  AFG          Broadband subscriptions  2006 0.0000197 
 4 Afghanistan  AFG          Broadband subscriptions  2007 0.0000193 
 5 Afghanistan  AFG          Broadband subscriptions  2008 0.0000189 
 6 Afghanistan  AFG          Broadband subscriptions  2009 0.0000365 
 7 Afghanistan  AFG          Broadband subscriptions  2010 0.0000532 
 8 Afghanistan  AFG          Broadband subscriptions  2012 0.0000492 
 9 Afghanistan  AFG          Broadband subscriptions  2013 0.0000476 
10 Afghanistan  AFG          Broadband subscriptions  2014 0.0000458 
# ℹ 25,122 more rows

If you can see something, you’ve read in your first dataset.

The output in the console tells you that it’s been read in as a tibble, and that it has 25,132 rows and 5 columns.

Tibbles

A tibble is just a funny name for a table - anything with rows and columns. Sometimes you’ll see it referred to as a dataframe. Don’t worry too much about it!

The five columns are called “country_name”, “country_code”, etc.

Below each column name is a little symbol which tells us what type of data is in that column. For example, the first column, marked <chr>, contains character values (text). The last column, marked <dbl>, contains doubles (basically numbers that can have decimal places).
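If you want to check a data type yourself, the typeof function that comes with R will tell you. This is only a sketch with a couple of made-up values that mimic two of the columns, not the real dataset.

```r
# Made-up values that mimic two of the columns in the dataset
country_name <- c("Italy", "Norway")
value <- c(0.00202, 0.0149)

# Text is stored as the "character" type
typeof(country_name)
# [1] "character"

# Numbers with decimal places are stored as the "double" type
typeof(value)
# [1] "double"
```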

Finally, the output shows us what’s in the first 10 rows.

The code you’ve just run:

readxl::read_xlsx(path = "communications.xlsx") 

is made up of different parts.

The most important part is the function - read_xlsx.

Because the function is from a package, we need to tell R which package it’s from using the package name (readxl) and two colons (::).

The read_xlsx function needs an argument called ‘path’ so it knows what file we want it to read in. We pass the argument the name of our file as a text string (in quotes).

Function arguments

Functions often need us to give them arguments. This tells the function what to do or how to do something.

Some functions need a lot of arguments, some only need one, others don’t need any at all!
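Here’s a quick sketch of what that looks like, using three functions that come with R. The numbers and text are made up for illustration.

```r
# round needs the number to round (x) and how many decimal places to keep (digits)
round(x = 3.14159, digits = 2)
# [1] 3.14

# paste has an optional 'sep' argument that sets what goes between the bits of text
paste("mobile", "landline", sep = " and ")
# [1] "mobile and landline"

# Sys.Date doesn't need any arguments, but it still needs its brackets
todays_date <- Sys.Date()
```

Naming each argument (x = , digits = ) is optional for many functions, but it makes the code easier to read, which is why this worksheet does it throughout.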

So we can do more things to our data, we now need to save it into an object using a little arrow thingy (also called an assignment operator) that combines a left chevron and a dash/minus sign to make <-.

Objects

An object is like a temporary file that R uses to store data. When you restart R it will be gone, but don’t worry, you can just run your code again to recreate it!
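As a tiny illustration of the arrow in action (the object name my_number is made up, and you don’t need this in your script):

```r
# Save the result of a calculation into an object called my_number
my_number <- 5 * 2

# Running the object name on its own shows what's stored inside it
my_number
# [1] 10

# Objects can be reused in later code
my_number + 1
# [1] 11
```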

In your script add a line before your function like this:

data_raw <- 
  readxl::read_xlsx(path = "communications.xlsx")

Now highlight the text on both lines and do Ctrl+Enter.

This time you shouldn’t see any information appear in the console. Instead, you should see an object called “data_raw” has appeared in the RStudio Environment pane in the top right quarter of the screen.

This is because, instead of showing you the output of the code (like it did the first time you ran it), R has saved the output into an object.

Again, you can see it has 25,132 rows (called obs. or observations here) and 5 columns (called variables here).
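If you’d rather check the size of a dataset with code than read it off the Environment pane, the nrow, ncol, and names functions that come with R will tell you. This sketch uses a tiny made-up table (toy_data is not the real dataset). Once you’ve made data_raw you could try the same functions on that instead.

```r
# A tiny made-up table with three rows and three columns
toy_data <- data.frame(
  country_name = c("Italy", "Norway", "Italy"),
  year = c(2000, 2000, 2001),
  value = c(0.5, 0.7, 0.6)
)

# How many rows (observations)?
nrow(toy_data)
# [1] 3

# How many columns (variables)?
ncol(toy_data)
# [1] 3

# What are the columns called?
names(toy_data)
```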

Click on the object name (not the blue arrow button!) in the Environment pane. You should see a spreadsheet-like tab appear. You can scroll around and sort columns, but you can’t edit any data here!

You’ve now created your first statement, which uses a function from a package to read in data from an Excel file and save it into a temporary object.

6.4 Filter out some rows

The dataset has a column called “country_name”. Let’s filter the data so we only have rows where this country column contains either “Italy” or “Norway”.

We’ll use the filter function from the dplyr package. Install the dplyr package first by running install.packages("dplyr") in the console.

We’ll also use something called a pipe, which is made of a vertical line (|) and a right chevron (>). The vertical line is normally somewhere in the bottom left of your keyboard near Left Ctrl.

The pipe

The pipe (|>) is like “and then” in English. It takes the output of the code just before it and then passes it to the code just after it.

We can take an object full of data and pass it to a function. We could then add another pipe and add another function - we’ll cover this in a minute!
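To see the “and then” idea with something simpler than a dataset, here’s a sketch using three numbers and two functions that come with R. Both versions do exactly the same thing. The pipe just lets you read the steps in order.

```r
# Without the pipe: you have to read from the inside out
sum(sqrt(c(1, 4, 9)))
# [1] 6

# With the pipe: take the numbers, and then square-root them, and then add them up
c(1, 4, 9) |>
  sqrt() |>
  sum()
# [1] 6
```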

A couple of lines after the first statement, type the following in your script, highlight it, and do Ctrl+Enter to run it:

data_raw |> 
  dplyr::filter(country_name %in% c("Italy", "Norway")) 

You should see that this code returns a tibble with only 268 rows!

# A tibble: 268 × 5
   country_name country_code indicator_name           year   value
   <chr>        <chr>        <chr>                   <dbl>   <dbl>
 1 Italy        ITA          Broadband subscriptions  2000 0.00202
 2 Italy        ITA          Broadband subscriptions  2001 0.00684
 3 Italy        ITA          Broadband subscriptions  2002 0.0149 
 4 Italy        ITA          Broadband subscriptions  2003 0.0392 
 5 Italy        ITA          Broadband subscriptions  2004 0.0817 
 6 Italy        ITA          Broadband subscriptions  2005 0.117  
 7 Italy        ITA          Broadband subscriptions  2006 0.145  
 8 Italy        ITA          Broadband subscriptions  2007 0.172  
 9 Italy        ITA          Broadband subscriptions  2008 0.190  
10 Italy        ITA          Broadband subscriptions  2009 0.203  
# ℹ 258 more rows

What you’ve done here is take the object called “data_raw” and use the pipe to pass it to the dplyr::filter function.

You might notice there’s another function in there called c. This is not from a package but is included in R. There’s also a funny looking %in% symbol. Don’t worry about this for now.

Now save the output of this statement into a new object called data_filtered, like this:

data_filtered <- 
  data_raw |>
  dplyr::filter(country_name %in% c("Italy", "Norway"))

Again, nothing appears in the console, but a new object should have appeared in the Environment pane.

6.5 Filter out some more rows

Now let’s just keep rows where the column called “indicator_name” does NOT contain the text “Broadband subscriptions”.

We can add another dplyr::filter using a second pipe, like this:

data_filtered <-
  data_raw |>
  dplyr::filter(country_name %in% c("Italy", "Norway")) |> 
  dplyr::filter(indicator_name != "Broadband subscriptions") 

Rerun these four lines to recreate the “data_filtered” object with this new code. You’ll see in the Environment pane it now only has 224 rows!
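As an aside, dplyr::filter will also accept several conditions at once, separated by commas, and only keeps rows that pass all of them. The sketch below uses a made-up four-row table (toy_data and the “Mobile subscriptions” label are invented for this example) to show that two piped filters and one combined filter keep the same rows.

```r
# A tiny made-up table (not the real dataset)
toy_data <- data.frame(
  country_name = c("Italy", "Italy", "Norway", "Peru"),
  indicator_name = c("Broadband subscriptions", "Mobile subscriptions",
                     "Mobile subscriptions", "Mobile subscriptions")
)

# Two filters joined with a pipe, like in the worksheet
two_steps <-
  toy_data |>
  dplyr::filter(country_name %in% c("Italy", "Norway")) |>
  dplyr::filter(indicator_name != "Broadband subscriptions")

# One filter with both conditions separated by a comma
one_step <-
  toy_data |>
  dplyr::filter(country_name %in% c("Italy", "Norway"),
                indicator_name != "Broadband subscriptions")

# Both versions keep the same two rows
nrow(two_steps)
nrow(one_step)
```

Which style you use is a matter of taste. Separate filters can be easier to switch on and off while you’re experimenting.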

Filtering data

Here are some common ‘operators’ for filtering data in dplyr::filter.

== equals

!= not equal to

>, >= greater than, greater than or equal to

<, <= less than, less than or equal to

%in% is one of (matches any value in a list of values)

!your_column_name %in% is not one of

You can sometimes use functions to filter data.

is.na(your_column_name) is an NA (missing) value

!is.na(your_column_name) isn’t an NA (missing) value
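Each of these operators gives back a TRUE or FALSE for every row, and dplyr::filter keeps the rows that come out TRUE. Here’s a sketch of that using two short made-up columns (not the real dataset), so you can see the TRUEs and FALSEs for yourself.

```r
# Two made-up columns, one with a missing value
country <- c("Italy", "Norway", "Peru", NA)
year <- c(1999, 2005, 2010, 2020)

# Equals: the missing value gives NA rather than TRUE or FALSE
country == "Italy"
# [1]  TRUE FALSE FALSE    NA

# Is one of
country %in% c("Italy", "Norway")
# [1]  TRUE  TRUE FALSE FALSE

# Is not one of
!country %in% c("Italy", "Norway")
# [1] FALSE FALSE  TRUE  TRUE

# Greater than or equal to
year >= 2005
# [1] FALSE  TRUE  TRUE  TRUE

# Is a missing value
is.na(country)
# [1] FALSE FALSE FALSE  TRUE
```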

6.6 Make an empty plot

We’ll be using the ggplot2 package for most of this course. Install it with install.packages("ggplot2").

Now take the filtered data object and pipe it into the ggplot function from the ggplot2 package, like this, and run these two lines of code:

data_filtered |> 
  ggplot2::ggplot() 

You should now see this lovely plot appear in the RStudio Plots pane!

6.7 Choose x and y

Tell the ggplot function that we want the “year” column on the x axis and the “value” column on the y axis using the ‘mapping’ argument. The thing you’ll need to pass to the mapping argument is another function from ggplot2 called aes (this is short for ‘aesthetics’).

Add the mapping argument like this, and run these three lines of code:

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year, 
                                         y = value)) 

6.8 Add some points

After our initial ggplot function we can add layers and customise our plot by using a + to string together multiple functions from ggplot2.

Add a geom_point function like this to add points onto the chart, and run the statement:

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value)) +
  ggplot2::geom_point() 

6.9 Colour by country

Colour the points by the “country_name” column by adding a ‘colour’ (or ‘color’) argument inside the aes function, and run the statement. Don’t forget to use commas to separate the arguments!

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) + 
  ggplot2::geom_point()

6.10 Add a custom palette

ggplot2 has used a default colour palette. Let’s add our own by using a + and adding another ggplot2 function called scale_colour_brewer.

Use a Brewer palette from https://colorbrewer2.org.

Copy the palette name from here (red arrow in screenshot) and paste it into the text string passed to the ‘palette’ argument of scale_colour_brewer:

I’ve used the “Dark2” palette.

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point() +
  ggplot2::scale_colour_brewer(palette = "Dark2") 

6.11 Add a theme

ggplot2 gives you some out-of-the-box themes to style the background, gridlines, and axes of your plot.

To use one of these themes just add another ggplot2 function like theme_classic on a new line. Don’t forget to add another + after the preceding ggplot2 function!

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point() +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::theme_classic() 

6.12 Remove the legend title

An out-of-the-box theme like theme_classic is a good starting point but we can add other changes as well. To do this we’ll add a theme function after the theme_classic line. Then, we’ll remove the title text from the legend by using the ‘legend.title’ argument to theme, and setting it to ggplot2::element_blank().

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point() +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank()) 

6.13 Move legend to the top

The theme function can take lots of arguments. Run ?ggplot2::theme in the console to take a look at the help docs for the function.

Let’s use the ‘legend.position’ argument to theme to move the legend to the top, like this:

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point() +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top") 

6.14 Change the axis titles

By default, ggplot2 uses the column names from the aes function to make axis titles. Often they don’t look great or need some extra information.

Let’s add the labs function from ggplot2 to override these titles with something more meaningful.

We’ll put labs somewhere before theme, but after geom_point. The order we stack ggplot2 functions can be important, but not always. You’ll have to experiment to see what happens when you change things around.

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point() +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::labs(x = "Year", 
                y = "Subscriptions per person") + 
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")

6.15 Make the points fancier

We changed the colour of the points inside aes because we wanted it to be controlled by a column in our data. If we want to change all the points at once, we can do so outside the aes and in the actual geom_point function.

Let’s change the ‘size’ and the ‘alpha’ (transparency) arguments to make the points larger and more see-through, like this:

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3, 
                      alpha = 0.75) + 
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::labs(x = "Year",
                y = "Subscriptions per person") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")
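To make the inside-aes versus outside-aes difference concrete, here’s a side-by-side sketch. It uses a tiny made-up table (toy_data is not the real dataset) and saves each plot into an object so the two are easy to compare. The colour “purple” is just an example.

```r
# A tiny made-up dataset with the same column names as the real one
toy_data <- data.frame(
  year = c(2000, 2010, 2000, 2010),
  value = c(0.5, 0.9, 0.7, 1.1),
  country_name = c("Italy", "Italy", "Norway", "Norway")
)

# Inside aes: colour is controlled by a column, so each country gets its own colour
plot_mapped <-
  toy_data |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3)

# Outside aes: colour is one fixed value, so every point is purple
plot_fixed <-
  toy_data |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value)) +
  ggplot2::geom_point(colour = "purple",
                      size = 3)

# Run each object name to preview the plot in the Plots pane
plot_mapped
plot_fixed
```

The first plot gets a legend because the colours mean something. The second doesn’t, because the colour is purely decoration.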

6.16 Change the x axis breaks

The x axis automatically has a tick every 20 years. If we want to override this we need to add a scale_x_continuous function. Then we can set the ‘breaks’ argument using a value created by the breaks_width function from the scales package.

We didn’t have to install the scales package because it gets installed when we install ggplot2!

Here you can see we set the width between each break on the x axis to be 10.

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3,
                      alpha = 0.75) +
  ggplot2::scale_x_continuous(breaks = scales::breaks_width(width = 10)) + 
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::labs(x = "Year",
                y = "Subscriptions per person") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")
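If you’re curious what breaks_width actually hands over to ggplot2, you can try it on its own. It gives back a function, and that function turns the smallest and largest values on an axis into a set of evenly spaced breaks. A minimal sketch, with the name make_breaks and the years made up for this example:

```r
# breaks_width gives back a function that works out where the breaks should go
make_breaks <- scales::breaks_width(width = 10)

# Give it the smallest and largest values on an axis (made-up years here)
make_breaks(c(1960, 2020))
# [1] 1960 1970 1980 1990 2000 2010 2020
```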

6.17 Add a facet

A very powerful tool for making plots in ggplot2 is faceting. We can use facet_wrap to make one small plot for each value in a column (like “indicator_name”).

Inside facet_wrap we can pass the ‘facets’ argument a column name ‘wrapped’ in the ggplot2::vars function (don’t ask me why!), like this:

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3,
                      alpha = 0.75) +
  ggplot2::scale_x_continuous(breaks = scales::breaks_width(width = 10)) +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::facet_wrap(facets = ggplot2::vars(indicator_name)) + 
  ggplot2::labs(x = "Year",
                y = "Subscriptions per person") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")

6.18 Make facets in a column

We can set the facets to stack into one column by setting the ‘ncol’ argument to 1.

data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3,
                      alpha = 0.75) +
  ggplot2::scale_x_continuous(breaks = scales::breaks_width(width = 10)) +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::facet_wrap(facets = ggplot2::vars(indicator_name),
                      ncol = 1) + 
  ggplot2::labs(x = "Year",
                y = "Subscriptions per person") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")

6.19 Save the plot into an object

When we run our code, our plot gets previewed in the RStudio Plots pane. Great for experimenting and making a plot, but now we’ve finished we’ll want to save it to a file.

First we must save our plot into an object, just like we did in the first and second statements we wrote. Add a line above our plot code to do this. I’ve called the object “my_plot” because I’ve no imagination. Sometimes no imagination is safer!

my_plot <- 
  data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3,
                      alpha = 0.75) +
  ggplot2::scale_x_continuous(breaks = scales::breaks_width(width = 10)) +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::facet_wrap(facets = ggplot2::vars(indicator_name),
                      ncol = 1) +
  ggplot2::labs(x = "Year",
                y = "Subscriptions per person") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")

6.20 Where’s my plot gone?

Now when we run this statement we won’t get a preview in the Plots pane!

To demonstrate this, click the brush icon in the Plots pane to remove all plot previews and then rerun your code statement. If you’ve done it right, the Plots pane will remain empty!

Instead, a new object has been created. We can see this new object has appeared in the Environment pane. However, clicking on it will just give us some complicated nonsense - don’t do that unless you like that kind of thing!

If we want to preview our plot now it’s been saved into an object, we must run the object in the console!

To do this highlight the object name in the script (I double-click it) and do Ctrl+Enter.

Now you should see your chart in the Plots pane again!

6.21 Save the plot into a file

Now our plot is in an object we can save it into a file using the ggsave function from ggplot2.

This time we don’t want to stick it onto the stack of ggplot2 functions with another +. Instead, ggsave stands alone as its own statement.

And, unlike other statements we don’t assign the output into an object with a <-!

Set the ‘filename’ argument of the ggsave function to the name of the file you want to save the plot as. I almost always use a png file, as it’s high resolution for a very small file size.

Set the ‘plot’ argument to be the name of your plot object.

I also set the ‘height’ and ‘width’ arguments, along with ‘units’. These three together control the dimensions of the file we’ll create. I normally have to use trial and error to find the best dimensions for each plot.

Finally, you can set the dpi (dots per inch) to control the resolution. I often use 300, 600, or 1200. A dpi of 1200 gives the highest resolution of the three, but takes longer to save and makes a bigger file. I normally just use 300 unless I really need more.

ggplot2::ggsave(filename = "plot_mobile_landline.png", 
                plot = my_plot, 
                width = 110, 
                height = 200, 
                units = "mm", 
                dpi = 300) 
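To get a feel for how the dimensions and the dpi work together, you can work out roughly how many pixels the saved image will have. This is just back-of-the-envelope arithmetic (there are 25.4 mm in an inch), and the helper name mm_to_pixels is made up for this example. The exact pixel count in the saved file may be rounded slightly differently.

```r
# Hypothetical helper: convert a length in mm to pixels at a given dpi
mm_to_pixels <- function(length_mm, dpi) {
  length_mm / 25.4 * dpi
}

# Our plot is 110 mm wide and 200 mm tall at 300 dpi
mm_to_pixels(length_mm = 110, dpi = 300)  # roughly 1299 pixels wide
mm_to_pixels(length_mm = 200, dpi = 300)  # roughly 2362 pixels tall

# At 1200 dpi the same plot has four times as many pixels in each direction
mm_to_pixels(length_mm = 110, dpi = 1200)
```

Four times as many pixels in each direction means sixteen times as many pixels overall, which is why a higher dpi takes longer to save and makes a bigger file.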

6.22 Add a comment

You can add comments to your code to help other people (or you in the future!) understand what your code is doing.

You can turn a whole line or multiple lines into comments by selecting them and then doing Ctrl+Shift+C. Use this shortcut again to uncomment the lines.

Comments

A comment is any text after a # symbol. A comment can be on a new line, or on a line after some code.

R will ignore any comments when you run your code.

Add a comment after a line of code, like this:

ggplot2::ggsave(filename = "plot_mobile_landline.png",
                plot = my_plot,
                width = 110,
                height = 200,
                units = "mm",
                dpi = 300) # this argument changes the plot resolution 

6.23 Add some section headers

You can add a special type of comment called a section header by doing Ctrl+Shift+R on a blank line and typing a section title in the popup box. I tend to use CAPS for these, but you don’t have to!

Add a section header above the statement where your script reads in the raw data, like this:

# READ IN DATA FROM EXCEL FILE -------------------------------------------- 

data_raw <-
  readxl::read_xlsx(path = "communications.xlsx")

Now add section headers in front of the parts of your code that filter data, create the plot, and save the plot, like this:

# FILTER OUR DATASET ------------------------------------------------------ 

data_filtered <-
  data_raw |>
  dplyr::filter(country_name %in% c("Italy", "Norway")) |>
  dplyr::filter(indicator_name != "Broadband subscriptions")



# MAKE PLOT --------------------------------------------------------------- 

my_plot <-
  data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3,
                      alpha = 0.75) +
  ggplot2::scale_x_continuous(breaks = scales::breaks_width(width = 10)) +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::facet_wrap(facets = ggplot2::vars(indicator_name),
                      ncol = 1) +
  ggplot2::labs(x = "Year",
                y = "Subscriptions per person") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")



# SAVE PLOT --------------------------------------------------------------- 

ggplot2::ggsave(filename = "plot_mobile_landline.png",
                plot = my_plot,
                width = 110,
                height = 200,
                units = "mm",
                dpi = 300) # this argument changes the plot resolution

6.24 Test the entire script

To make sure everything you’ve written in your script works together and is reproducible, first delete the png file from the RStudio Files pane by selecting it with the checkbox and clicking the Delete file button.

Then restart R with Ctrl+Shift+F10 or Session > Restart R from the RStudio menu.

This should remove all objects from your Environment pane, giving you a blank slate to test what you’ve written, like this:

Now, highlight all your code and run it. Check the console for errors, and check the RStudio Files pane for the newly created plot.
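You can also ask R whether the file is there instead of hunting for it in the Files pane. A minimal sketch using the file.exists function that comes with R, assuming you kept the filename used earlier in this worksheet:

```r
# The filename we gave to ggsave earlier
plot_file <- "plot_mobile_landline.png"

# file.exists gives back TRUE if the file is in your project, FALSE if not
if (file.exists(plot_file)) {
  message("Found ", plot_file, " - the script made the plot!")
} else {
  message("Can't find ", plot_file, " - check the console for errors")
}
```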

If you’ve got no errors in the console it will look something like this, with the code you’ve just run and then a new > and cursor ready for the next command.

Nice work!

7 The final script


# READ IN DATA FROM EXCEL FILE --------------------------------------------

data_raw <-
  readxl::read_xlsx(path = "communications.xlsx")



# FILTER OUR DATASET ------------------------------------------------------

data_filtered <-
  data_raw |>
  dplyr::filter(country_name %in% c("Italy", "Norway")) |>
  dplyr::filter(indicator_name != "Broadband subscriptions")



# MAKE PLOT ---------------------------------------------------------------

my_plot <-
  data_filtered |>
  ggplot2::ggplot(mapping = ggplot2::aes(x = year,
                                         y = value,
                                         colour = country_name)) +
  ggplot2::geom_point(size = 3,
                      alpha = 0.75) +
  ggplot2::scale_x_continuous(breaks = scales::breaks_width(width = 10)) +
  ggplot2::scale_colour_brewer(palette = "Dark2") +
  ggplot2::facet_wrap(facets = ggplot2::vars(indicator_name),
                      ncol = 1) +
  ggplot2::labs(x = "Year",
                y = "Subscriptions per person") +
  ggplot2::theme_classic() +
  ggplot2::theme(legend.title = ggplot2::element_blank(),
                 legend.position = "top")



# SAVE PLOT ---------------------------------------------------------------

ggplot2::ggsave(filename = "plot_mobile_landline.png",
                plot = my_plot,
                width = 110,
                height = 200,
                units = "mm",
                dpi = 300) # this argument changes the plot resolution